INTRODUCTION

One of the modern developments
in the educational world is known as the Testing
Movement. Up to the year 1900 practically the
only forms of examination used were the conventional
essay type and the quiz. With the development
of science and experimental psychology
came also the study and measurement of individual
differences.

Measurement is essential in the
educational process. Since learning takes place
more readily when the results are accompanied
by satisfaction, activity is essential in learning.
One must, therefore, be able to measure
success or failure and improvement. Measurement
in education aims to do this.

Although it is difficult to
define the exact lines of demarcation, Monroe¹
states there are four distinct types of procedures
employed in measuring school achievements:


1. Monroe, Walter S. Directing Learning in the
High School. New York: Doubleday,
Doran & Company, Inc., 1927, p. 492


(1) informal estimating of performances, both
oral and written; (2) written examinations of
the essay type; (3) written examinations consisting
of exercises constructed so that the
marking of the papers is highly objective, frequently
called "new examinations"; (4) standardized
tests which are distinguished from "new
examinations" by the norms or standards which
have been determined for the interpretation of
the measures obtained.

The  essential  characteristic 
of  the  "new  examination"  is  that  the  exercises 
are  constructed  so  that  only  one  response  is 
correct  and  hence  the  scoring  is  objective 
rather than subjective. "New examinations"²
are limited to those responses that may be classified
as either right or wrong. They may be used
to  measure  specific  habits,  including  memorized 
facts,  but  it  does  not  appear  that  they  can  be 
used  to  measure  knowledge,  ideals,  or  attitudes 
directly.     Questions  which  ask  the  student  to 


2. Ibid., p. 497


define, explain, discuss, give reasons why, or
compare are excluded.

It  is  extremely  interesting  to 
note that as early as 1845, Horace Mann laid
down eight advantages³ of the objective test
over  the  conventional  form  of  essay  test,  which 
are  as  follows: 

1.  It  is  impartial. 

2.  It  is  just  to  the  pupil. 

3.  It  is  more  thorough  than  the 
older  forms  of  examination. 

4.  It  prevents  the  "officious" 
interference  of  the  teacher. 

5.  It  determines  beyond  appeal  or 
gainsay  whether  or  not  the 
subject-matter has been faithfully
and competently taught.

6.  It  takes  away  all  possibility 
of  favoritism. 

7.  It  makes  the  information 
obtained  available  to  all. 

8.  It  enables  all  to  appraise  the 
ease  or  difficulty  of  the 
questions.

However, the objective examination
has a serious disadvantage compared with the essay
type examination in that it does not furnish
the opportunity for self-expression in written
language.⁴

While  the  value  of  the  written 
essay  form  of  examination  does  furnish  greater 
opportunity  in  the  use  of  English  composition, 
it  is  somewhat  debatable  as  to  just  how  valuable 
this  sort  of  exercise  is  when  we  consider  first 
that  most  examinations  as  given  are  not  and 
should  not  be  given  as  a  test  particularly  in 
English  composition,  but  rather  as  a  test  in 
knowledge  of  the  subject-matter  covered  during 
a  given  period;  and  second,  the  very  hurried 
and  brief  way  pupils  are  expected  to  record  their 
answers,  frequently  in  outline  form,  when  neither 
teacher nor pupil has much if any concern regarding
the English used but only for getting the largest
number of questions answered within the limited
length of time at their disposal.⁵

The  essay  type,  though  highly 
subjective,  should  not  be  abandoned  altogether 

4. Odell, C. W. Traditional Examinations and New-Type
Tests. Century Co., 1928, p. 14

5. Ruch, G. M. op. cit. p. 7


but  rather  its  use  might  be  richly  supplemented 
by  a  much  more  extensive  use  of  the  objective 
tests.⁶

In  the  commercial  field,  it  is 
possible  to  reproduce  approximately  in  tests 
the  same  situations  and  to  call  for  the  exercise 
of the same abilities encountered in actual practice.
The desirability of measuring such intangible
outcomes as appreciations, ideals, and so
forth, does not arise.⁷

The  most  commonly  used  types  of 
objective tests are the true-false, completion,
multiple-choice, and matching tests.⁸

The  best  type  of  objective  test 
for quick review is the true-false, by which a
great deal of ground can be covered in a very
short time.⁹

It is with the true-false type
of  objective  test  that  this  thesis  deals. 

6. Odell, C. W. op. cit. p. 14

7. Odell, C. W. op. cit. p. 336

8. Tiegs, E. W. Tests and Measurements for Teachers.
Houghton Mifflin & Co., New York, p. 243

9. Ruch, G. M. and Stoddard, G. D. Tests and Measurements
in High School Instruction. World
Book Co., p. 2[illegible]8


Now true-false tests have
certain advantages and certain disadvantages
compared with other types of objective tests, which may
be listed as follows:¹⁰
Advantages:

1.  Increased  reliability 

2.  Greater  objectivity 

3. Time taken to give the test
relatively  short 

4.  Time  required  to  score  it 
also  short 

5.  Ground  covered  more  extensive 

6.  Saving  of  pupil's  effort. 
Disadvantages:

1.  Do  not  measure  the  most  important 
outcomes  of  learning,   such  as  a 
pupil's attitude toward the subject
matter, or his originality.

2.  Their  preparation  is  too  time 
consuming. 

3. They provide no training in organization
and expression.

4.  Some  of  the  tests  appeal  too 
much  to  memory. 

5.  They  permit  guessing. 


10. Hyde, R. E. Guessing and Success on the
True-False Test. Educational Methods
Vol. 8:230, January 1929
All articles regarding true-false
tests mentioned in the Educational Index
from  January  1929  to  April  1934  were  read  and 
are  summarized  as  follows: 

Dwight in his study¹¹ presents
the psychological question in regard to the undesirability
of placing misleading statements
before pupils. He says: "When true and also
false statements are spread out before the
students, is it not possible that the latter
will happen to arrest attention and fix impressions,
in such a way that, after the classroom
doors  have  been  locked  and  all  have  gone  home, 
it  will  be  the  erroneous  statement  that  will 
continue  to  stare  at  the  pupils   (some  of  them, 
at  least)  or  be  read  as  from  a  blackboard  in 
their  brains?" 

Dwight  states  that  the  majority, 
perhaps  two-thirds  of  the  pupils  are  predominantly 
visualizers. An impression through the eye on
the mind may be rapid and the reaction may persist
as a mental twist. At best, the whole

11. Dwight, C. A. A. What is False About True and
False. Educational Method
Vol. 10:557-8, Je '31

effect  of  a  true-false  statement  is  that  of  a 
blur,  a  confusion,  a  mystification,  "from  which 
the escape mayhap is by means of a guess."¹²

He  claims  that  positive  injury 
is done to the pupil inasmuch as a false statement
has at least as good a chance to live on
in the memory as has a true one. Even where the
student has answered a negative to a false statement,
and that too not as a mere guess, the effect
of its visualization may in subtle and confusing
ways or degrees remain.

Dwight  further  states  that,  if 
his  argument  be  correct,  the  difficulty  cannot 
be wholly removed by a propounding of the statements
orally, since about one-third of pupils
are  audiles. 

Jersild¹³ is of the same opinion
as Dwight in regard to this aspect of the true-false
test. He states in his article: "It appears
that  the  true-false  test  is  of  dubious  value  as 
a  pedagogical  instrument  (only  insofar  as  the 

12. Ibid.

13. Jersild, A. T. Examination as an Aid to Learning.
Journal of Educational Psychology
Vol. 20:602-9, Nov. 1929


test should serve as an aid to learning.)" He
goes on to say that on purely theoretical grounds
it is open to two serious charges. First, a
test of this kind, presenting as it does a random
and unpredictable intermingling of true and
false propositions, may have just as much the
effect of perpetuating error as of strengthening
proper associations and stimulating wholesome
curiosity. Each statement calls for a categorical
true or false. Whether the response is
right or wrong, the mere act of putting a stamp
of affirmation or denial on a given statement
has the effect of strengthening the association
so formed.

Another  shortcoming  he  attributes 
to  the  true-false  test  is  the  fact  that  it  does 
not  make  strong  demands  upon  the  industry  of 
the  examinee.     He  states  that  "It  is  more  nearly 
a  test  of  passive  recognition  than  of  active 
recall."     "In  responding  to  a  true-false  test", 
he  continues,  "the  student  is  not  called  upon 
to  organize  his  knowledge  or  to  reduce  it  to 
systematic statement with proper emphasis in
the most significant details."

He  concludes  that  a  direct 
interrogation  constitutes  a  more  intense  stimulus 
than does a narrative statement and will, accordingly,
give rise to a more lively response, and
that  an  examination  serves  as  an  aid  to  learning 

insofar  as  it  puts  this  principle  to  a  practical 
account  by  stimulating  the  industry  of  the  learner. 

Contrary to the above opinions
are those of Arnold,¹⁴ who states that the experience
resultant from the taking of true-false
tests should aid in developing the habit of
asking in regard to every statement, "Is this
true?" The fact that this type of test creates
doubt is not a fault, he claims, but a merit,
because one of the dangers besetting our form
of government is "the effectiveness of fallacious
and insidious propaganda."¹⁵ A method which
would result in a decreased tendency to believe
whatever is in printed form is from that standpoint
to be commended. Thus it would seem that
the "true-false" idea in testing has an intrinsic

14. Arnold, H. L. Defense of the True-False Test.
California Quarterly of Secondary
Education 4:145-6, Ja. '29

15. Ibid.
value in training for citizenship. The setting
in  this  type  of  test  is  that  of  a  life  situation. 
Everyone  is  required  constantly  to  face  questions 
involving  a  decision  of  "true"  or  "false". 

While pupils sometimes guess in
giving responses to this type of examination, it
should not be forgotten that pupils sometimes
guess when taking other types of tests.¹⁶

As  each  pupil  makes  a  decision 
on successive items in a true-false test, his
responses might be separated into three rather
distinct categories:¹⁷

First, he may be said to have
exact knowledge of an item when he is absolutely
sure of the answer:

Second,  he  may  be  said  to  have 
part  knowledge  of  an  item  when  he  knows  something 
about  it  but  is  not  positive  of  the  answer: 

Third,  there  are  a  number  of 
items  of  which  the  student  is  certain  that  he 
has no knowledge and any answer here made would

16. Odell, C. W. op. cit. p. 15

17. Melbo, I. R. How Much Do Students Guess in
Taking True-False Examinations?
Ed. Methods 12:495-7, My '33

probably  be  a  pure  guess. 

The  typical  "do  not  guess" 
instructions  define  guessing  as  "any  response 
made  without  a  better  basis  than  pure  chance." 

Melbo in his study defines
"guessing"  to  mean  "any  situation  wherein  the 
student  taking  the  test  is  definitely  sure  he 
knows  nothing  at  all  about  the  true-false  item 
under  consideration,  and  that  any  answer  he  may 
give  is  just  a  'pure  guess'  with  an  equal  (fifty- 
fifty)  chance  of  being  either  right  or  wrong." 

On this basis, a uniform direction
sheet for use in connection with all true-false
tests was prepared. A true-false test of
fifty items was given to a class of twenty-three
college students taking a course in elementary
sociology. When this class claimed exact knowledge,
their responses were correct 68% of the
time. When guessing was indicated, their responses
were only 59% correct. These preliminary results
verified  the  theoretical  assumptions  with  the 
exception  that  the  use  of  the  new  directions 
for  indicating  states  of  knowledge  may  have 


affected  the  reliability  of  the  tests.  To 
clear  this  point,  two  different  tests  with 
equivalent forms were devised. The new directions
were used with only one form. The tests
were given to students in each of the various
levels  in  both  the  high  school  and  college 
departments  of  New  Mexico  State  Teachers  College. 
A  total  of  67,770  responses  from  1,480  different 
test  papers  were  tabulated. 

The  findings  were  as  follows: 

1. Students guess 14.55% of
the time, use part knowledge almost 46% of the
time, and use exact knowledge on nearly 50% of
the true-false items;

2. When students claim to know
the answer exactly, they are actually right about
81% of the time, for part knowledge 72% right,
and for pure guessing the chances are as 585
to 415 that the answer will be right;

3.  Students  get  about  77%  of 
their  total  true-false  items  right  and  about  25% 
wrong;  of  the  total  number  of  rights,  56%  come 
from items of which the students have exact
knowledge, and only 11% from pure guess; of
the total number of wrongs, 29% come from exact
knowledge, 44% from part knowledge and 26% from
pure guess.
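
The shares attributed to pure guessing in the third finding can be cross-checked against the guessing rate and the 585-to-415 odds given in the first two. The following is a minimal sketch of that arithmetic, not part of the study itself; it assumes the share of wrong answers is simply the complement of the 77% right, and all variable names are illustrative.

```python
# Figures quoted in the findings above.
guess_rate = 0.1455      # share of items answered by pure guess
guess_accuracy = 0.585   # 585 right to 415 wrong on guessed items
right_rate = 0.77        # share of all items answered correctly

# Assumption: every item is either right or wrong, so the wrong share
# is the complement of the right share.
wrong_rate = 1 - right_rate

# Guessed-and-right items as a fraction of all right items,
# and guessed-and-wrong items as a fraction of all wrong items.
share_of_rights_from_guessing = guess_rate * guess_accuracy / right_rate
share_of_wrongs_from_guessing = guess_rate * (1 - guess_accuracy) / wrong_rate

print(f"rights from guessing: {share_of_rights_from_guessing:.1%}")
print(f"wrongs from guessing: {share_of_wrongs_from_guessing:.1%}")
```

Under these assumptions both shares come out close to the 11% and 26% reported.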

The  coefficient  of  correlation 
between  the  tests  containing  the  new  directions 
and those containing the usual directions was .93.

Krueger¹⁸ conducted his study to
determine  experimentally  (1)  the  distributions 
of  frequencies  based  on  the  number  of  correct 
guesses  in  "true-false  tests"  of  various  lengths, 
and  (2)  to  find  a  practical  length  which  would 
eliminate  the  probability  of  getting  a  high  score 
by  chance  guessing. 

He  found  a  definite  and  obvious 
trend indicating that the longer the test is,
the greater is the frequency of scores within
the class intervals ranging from 41% to 60% of
the number of items in the test. For the longer
tests, practically all scores ranged between 45%
and 55% of the total test. He also found that
chance  guessing  may  frequently  yield  very  high 

18. Krueger, W. C. F. Distribution of Scores Based
on Correct Guessing for True-False Tests
of Various Lengths. J. Ed. Psy. 24:185-3, Mr. '33
and very low scores in short tests. In the
longer tests this probability is practically eliminated.
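
The trend Krueger observed experimentally agrees with what the binomial distribution predicts. The sketch below is not his procedure; it assumes every item is an independent guess with an even chance of being right, and the function name and test lengths are illustrative only.

```python
from math import comb

def chance_within(n_items, low_pct, high_pct):
    """Probability that pure guessing on n_items true-false items yields a
    score between low_pct and high_pct per cent of the items, inclusive."""
    lowest = -(-n_items * low_pct // 100)   # ceiling division
    highest = n_items * high_pct // 100     # floor division
    favourable = sum(comb(n_items, k) for k in range(lowest, highest + 1))
    return favourable / 2 ** n_items

for n in (10, 50, 100, 500):
    middle = chance_within(n, 41, 60)    # the class intervals named in the text
    high = chance_within(n, 80, 100)     # a "high" score reached by luck alone
    print(n, round(middle, 4), round(high, 6))
```

As the number of items grows, the probability of a chance score in the middle band rises toward certainty and the probability of a chance score of 80% or better shrinks toward zero.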

Krueger¹⁹ also planned an experiment
to suggest answers to the following problems:

1.  Will  a  person  write  "true" 
more  often  than  "false"  when  he  guesses  his 
answers  at  random  in  a  true-false  test?  What 
proportion  of  these  guesses  are  incidentally 
guessed  correctly? 

2.  When  a  person  is  limited  to 
fifty  per  cent  of  the  guesses  as  "true"  and 
fifty  per  cent  of  the  guesses  as  "false"  what 
proportion  of  the  guesses  will  happen  to  be 
correct? 

Lists of 100 words, 100 syllables,
and 100 numbers were presented to some one hundred
three students. The group was informed that
later the instructor would read to them a series
of words, syllables and numbers selected from the
lists before them. They were to select or guess
which of the items the experimenter had selected.

19. Krueger, W. C. F. An Experimental Study of Certain Phases
of a True-False Test. Journal of Educational
Psychology 23:31-91, F '32


It was found that the average
number of items guessed as "true" was as follows:

Words      51.10%
Syllables  50.90%
Numbers    51.00%

Number of items guessed correctly:

Words      49.87%
Syllables  50.02%
Numbers    49.75%

It  is  noted  that  "true"  was  written 
after  slightly  more  than  fifty  per  cent  of  the 
items.    The  frequency  of  correctly  guessed  items, 
when  checked  by  a  key  of  fifty  true  and  fifty 
false  items,  averaged  almost  fifty  for  the  three 
lists.

In  the  next  step  to  the  experiment, 
the  subjects  were  directed  to  limit  their  guesses 
of  "true"  to  fifty  and  their  guesses  of  "false" 
to  fifty.    The  same  lists  as  used  in  the  first 
experiment  were  given  and  the  subjects  still  were 
forced  to  guess  since  they  had  no  information  upon 
which to base their decisions. The average frequency
of correctly guessed items was 50.27%,
50.03%, and 50.23% respectively for the three
types of material. Incidentally, if the right
minus wrong formula had been used the average
scores would have been approximately zero.
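
The closing remark can be checked directly. The sketch below applies the right minus wrong formula to the averages just quoted, assuming 100 items per list and no omitted items; the function name is illustrative.

```python
def rights_minus_wrongs(n_right, n_wrong):
    """Score a true-false test by subtracting wrong answers from right ones."""
    return n_right - n_wrong

n_items = 100  # each list contained 100 items
averages = (("words", 50.27), ("syllables", 50.03), ("numbers", 50.23))

for label, pct_right in averages:
    n_right = n_items * pct_right / 100
    n_wrong = n_items - n_right          # assumption: no items omitted
    score = rights_minus_wrongs(n_right, n_wrong)
    print(label, round(score, 2))
```

Each corrected score is a fraction of one point out of a possible hundred, which is the sense in which the averages are approximately zero.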

In  connection  with  this  phase  of 
guessing  in  true-false  tests,  it  is  interesting 
to note the experiment of Brinkmeier and Keys,²⁰
inspired  by  their  convictions  that  the  amount  of 
guessing  which  takes  place  in  the  average  true- 
false  examination  is  much  greater  than  is  commonly 
realized  even  by  those  who  are  being  examined,  and 
that  many  of  the  statements  in  such  examinations 
can  be  recognized  as  true  from  their  form  and 
nature, apart from any knowledge of the particular
subject matter involved.

It  is  agreed  that  certain  words 
or  phrases  in  true-false  statements  often  serve 
as  "specific  determiners"  giving  fairly  dependable 
cues  to  the  correct  response. 

An  overwhelming  proportion  of 
statements  containing  the  words  "all",  "always", 
"only",  "no",  "never",  and  "none"  will  be  false, 
while  statements  qualified  by  "most",  "some", 
"probably",  "may",  "often",  and  the  like  are  true. 

20. Brinkmeier, I. H. and Keys, N. Circumstantiality
as a Factor in Guessing on True-False
Examinations. Journal of Educational
Psychology 21:681-94, D. '30


Pupils soon become aware of these peculiarities
of true-false tests and there is little
doubt  that  much  of  the  practice  effect  which 
gives  so  distinct  an  advantage  to  the  "test- 
wise"  pupil  must  be  attributed  to  increasing 
sensitivity  to  cues  of  this  sort. 
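
A test maker can screen draft statements for these cue words before a test is used. The sketch below is one illustrative way of doing so and is not drawn from the study; the word lists are those named above, while the function name and sample statements are hypothetical.

```python
import re

# Cue words named in the text as "specific determiners".
LIKELY_FALSE = {"all", "always", "only", "no", "never", "none"}
LIKELY_TRUE = {"most", "some", "probably", "may", "often"}

def cue_words(statement):
    """Return the specific determiners found in a true-false statement."""
    words = set(re.findall(r"[a-z]+", statement.lower()))
    return {
        "suggests_false": sorted(words & LIKELY_FALSE),
        "suggests_true": sorted(words & LIKELY_TRUE),
    }

# Hypothetical draft statements, for illustration only.
print(cue_words("All metals are always magnetic."))
print(cue_words("Some birds may migrate in winter."))
```

The check is deliberately crude: it matches whole words only and cannot tell whether a cue word is used in its telltale sense, so flagged statements still require the judgment of the test maker.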

The  experimenters  had  access 
to  376  objective  examinations  submitted  in  a 
nation-wide  prize  contest.    These  examinations, 
assembled  from  thirty-one  states,  covered  all 
the principal departments of high school instruction,
although slightly more than half dealt
with English and the Social Studies. They
included a total of 10,756 true-false statements.
Inspection of these last convinced the writer that
more than one-fifth of the statements were of
such a nature that their truth or falsity might
be  correctly  inferred  by  an  intelligent  and  "test- 
wise"  reader,  regardless  of  his  knowledge  of  the 
subjects.     After  eliminating  the  false  statements 
in  this  list  and  all  true  items  containing  one 
or more cue words and phrases of the specific-determiner
variety, there remained one hundred
statements the truth of which might be regarded
as to some extent self-evident. From this number
a random sampling of fifty was drawn. The quality
which these questions had is perhaps best
described as a "certain circumstantiality of
content and phrasing."²¹

The  following  statements  which 
were  taken  from  the  list  will  serve  to  illustrate 
this  quality: 

"Democracies  need  thinkers  who 
will cooperate in the solution of problems."

"Johnson was a great scholar
and  the  most  important  member  of  the  famous 
Literary  Club." 

There is also a strong presumption
on the part of students to the effect that
long  statements  in  a  true-false  examination  will 
be  true  ones;  e.g.   "In  Virginia,  the  growing  of 
tobacco  led  to  the  occupation  of  large  tracts 
of  land  and  made  impossible  the  establishment 
of  the  town  with  its  local  democratic  features 
of  meeting  house  and  public  school." 

Furthermore,  when  length  takes 
the form of a cataloging of details, this presumption
becomes a certainty; e.g. "The first

21.  Ibid 

quarter  of  the  nineteenth  century  saw  the 
beginnings of a true literature in the department
of poetry, fiction, and belles-lettres."

In addition to the fifty statements,
a second list of twenty-five statements
of the experimenters' own construction was prepared.
These were modeled closely upon statements
in the first list, and all shared the
common  characteristic  that  they  would  assuredly 
be  branded  as  false  by  one  fully  cognizant  of 
the  facts. 

The  fifty  true  statements  and  the 
twenty-five  false  statements  were  thrown  together. 
To  offset  in  part  the  number  of  obviously  true 
statements, the experimenters next selected from
the examinations submitted twenty-five additional
items  regarded  as  obviously  false.     Typical  of 
these  last  were  such  statements  as  "Daniel  Boone 
was  one  of  our  presidents."     "Hatcheries  are 
built  to  destroy  fish."    The  entire  100  were 
then  intermingled  in  chance  order  and  mimeographed 
under  the  title  of  a  "General  Information  Test". 

The  test  was  submitted  to  three 


classes  of  high  school  pupils  and  to  a  group  of 
one  hundred  senior  and  graduate  students  at  the 
University  of  California.     Each  statement  was 
to be marked true or false on its own merits.
In case the student did not know the answer to
a  given  statement  he  was  instructed  to  guess, 
but  any  answer  in  which  guessing  was  necessary 
was  to  be  indicated  by  a  question  mark  placed 
after  the  plus  or  minus  sign.     Students  were 
also  assured  that  their  answers  would  in  no  way 
affect their class marks. Twenty-five minutes
were  allowed  for  marking  the  statements,  which 
proved  ample  for  practically  all  members. 

Responses  to  each  of  the  one 
hundred test items were then tallied separately
according  to  whether  the  statement  was  marked 
true  or  false,  and  whether  guessing  had  been 
indicated  by  a  question  mark. 

Each  of  the  fifty  obviously 
true statements received a clear majority of
"Trues" ranging from 98.9% to 56.1%.

The  study  showed  that  students 
who  believed  that  they  knew  the  answer  to  these 


statements were right only 81% of the time,
while those who admitted they were guessing were
correct in 10%. Such figures go far to substantiate
the impression that statements of the type
of  the  fifty  obviously  true  statements  bear  too 
many  surface  indications  of  the  reply  expected, 
and  are  poorly  adapted  to  distinguish  between 
the  well-informed  and  the  merely  shrewd  pupil. 

In  the  total  replies  to  the 
false statements, however, the "Trues" outnumbered
the "Falses" by approximately 3 to 2.

While  superior  knowledge  enabled 
the  university  students  to  outdo  the  high  school 
pupils  in  the  number  of  statements  recognized 
as  false,  the  general  effect  of  circumstantiality 
of  form  in  suggesting  the  response  of  "true"  was 
evidently  much  the  same  for  both  groups.  It 
also  follows  that,  in  addition  to  being  careful 
in  the  use  of  cue  words  or  phrases  which  act 
as  "specific  determiners"  of  pupils'  responses, 
it  would  seem  best  to  eliminate  as  far  as 
possible  the  types  of  true  statements  which  bear 
so  many  surface  indications  of  their  verity  as  to 


be  of  little  value  for  purposes  of  measurement. 

To determine the exact relationship
which it was felt existed between the word-length
of statements and their truth or falsity,
data were derived by Brinkmeier²² from true-false
statements included in the 376 examinations previously
mentioned as entered by teachers in the
nation-wide contest in the construction of
objective examinations conducted by Drs. G. M.
Ruch and G. Rice. Again 6,671 statements were
made available for this study.

Each  statement  was  recorded  as 
true or false as the case happened to be; likewise
the number of words contained in each
statement was tabulated. A frequency distribution
table was constructed showing the grouping
of  the  word-lengths  of  statements  in  step-intervals 
of  five,  the  frequency  of  true  and  false  state- 
ments in  each  group,  the  percentage  of  true  and 
false  statements  and  the  probable  error  of  these 
percentages. 

Of the 6,671 statements analyzed
51.3% were true statements, and 32.4% were false.

Of these statements, 65.8% were
composed of from 6 to 15 words. In other words,
about two-thirds of all the statements were composed
of fifteen or fewer words.

Of 4,773 statements containing
from 3 to 15 words, 48.3% were true and 51.7%
false. That is, statements composed of 15
words or fewer tend to be false as often as true.

Long statements, that is, those
composed of from 20 to 25 words, were found to be
true in almost 66% of the cases, while those
composed of more than 25 words tend to be true
in about 80% of the cases.
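
These proportions indicate how far a pupil could get by attending to length alone. The sketch below is an illustration rather than part of the study: it takes the three reported proportions and works out the best blind guess for each length group, assuming the pupil knows nothing of the subject matter. The names are illustrative.

```python
# Share of true statements reported for each word-length group.
reported_true_share = {
    "3 to 15 words": 0.483,
    "20 to 25 words": 0.66,
    "more than 25 words": 0.80,
}

def best_blind_guess(true_share):
    """Return the better uniform answer and its expected accuracy."""
    if true_share >= 0.5:
        return "true", true_share
    return "false", 1 - true_share

for group, share in reported_true_share.items():
    answer, accuracy = best_blind_guess(share)
    print(f"{group}: mark {answer}, expect about {accuracy:.1%} right")
```

For short statements length gives almost no help, while for the longest statements marking every one "true" would be expected to succeed in about four cases out of five.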

A  probable  explanation  of  the 
tendency  for  long  statements  to  be  true  is  that 
teachers in attempting to construct true-false
statements  that  may  be  defended  as  definitely 
either  true  or  false,  add  dependent  phrases  or 
clauses.     The  evidence  indicates  that  those 
dependent  phrases  and  clauses  tend  to  erase  any 
possible  falsity  of  the  statements. 

It is frequently said that it is
better to record the first answer that comes to
the mind in taking a true-false test than to
stop  to  meditate  or  reason  on  the  question.  This 
idea  is  based  on  the  fact  that  true-false  tests 
are  in  large  measure  recognition  tests,  and 
that  the  first  response  to  an  idea  is  sometimes 
more  reliable  than  the  response  that  occurs 
after  mature  reflection. 

In  an  experiment  performed  by 
Lowe and Crawford²³ two types of procedure were
used. 

One  was  the  actual  tabulation 
of specific changes of answers in true-false
test papers answered under ordinary circumstances
without any thought of such an investigation on
the part of the students involved. This procedure
assumed that any answer that was not changed
was a "first impression" answer, and that any
answer that was changed was a "second thought"
answer. This is not necessarily a safe assumption,
since many unchanged answers are "second
thought"  answers  which  were  not  written  down  in 
both  forms. 

23.  Crawford,  C.C.  and  Lowe,  M.  L.  First  Impression 
The second type of procedure
used  was  designed  to  correct  this  weakness.  A 
large  class  was  divided  into  Groups  A  and  B. 
A  true-false  test  was  prepared  in  two  parts,  I 
and  II.     Each  test  was  mimeographed  so  that  two 
spaces  were  allowed  for  answering  each  question. 
The  first  space  was  for  "first  impression"  and 
the  second  was  for  "second  thought."    Group  A 
was  asked  to  take  Test  I  by  answering  rapidly 
all  questions  in  the  "first  impression"  space 
and  then  to  return  and  answer  each  with  more 
mature  reflection  in  the  "second  thought"  space 
reversing  the  previous  decision  wherever  desired. 

At  the  same  time,  Group  B  was 
asked  to  take  Test  I  by  a  "delayed  answer"  form 
of  second  thought.     This  consisted  of  reading 
all  questions  over  without  answering  any  of  them 
and  then  returning  to  answer  each  in  the  "second 
thought"  space  on  the  sheet.     This  was  to  prevent 
the  record  of  the  first  impression  from  influenc- 
ing the  final  second  thought  decision.  After 
Test  I  was  finished  in  this  manner  each  group 
took  Test  II,  but  the  methods  were  reversed. 


Data  secured  showed  a  definite 
superiority  of  changes  from  wrong  to  right  over 
changes from right to wrong, with the ratio being
almost exactly two to one for the total number of
changes.

No  significant  advantage  is  shown 
in having each student read the questions over
before answering them, since the "delayed answer"
scores  were  almost  exactly  equal  to  the  "first 
impression"  scores. 

A  very  important  factor  in  this 
experiment  is  the  amount  of  changing  of  answers 
which  took  place  when  the  students  were  asked  to 
record  their  "second  thought"  decisions.  There 
were  two  hundred  forty-seven  changes  out  of  a 
total  of  2,416  answers  or  almost  exactly  ten 
per  cent.     In  other  words,  "first  impression" 
decisions  were  still  clung  to  in  nine  out  of 
ten  cases  on  more  mature  consideration. 

To neutralize the effect of
guessing, the right minus wrong formula has
been approved by a number of writers, who have
varyingly recommended that the examinee either
be or not be encouraged to guess the correct
answer. Barton²⁴ states in his study that he
has always been skeptical about the validity of
the right minus wrong formula because it seemed
to assume that every item wrongly judged was
very likely to be a result of guessing. He
also disapproved of the formula because many
students have the feeling that their scores do
not truly represent their actual achievement on
a test.
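
The two scoring rules under discussion can be sketched as follows. This is only an illustration: the function names, the encoding of an omitted item as None, and the six-item paper are assumptions made here and are not taken from Barton's study.

```python
from typing import Optional, Sequence, Tuple


def tally(key: Sequence[bool],
          answers: Sequence[Optional[bool]]) -> Tuple[int, int, int]:
    """Return (right, wrong, omitted) for one paper; None marks an omitted item."""
    if len(key) != len(answers):
        raise ValueError("key and answers must have the same length")
    right = sum(1 for k, a in zip(key, answers) if a is not None and a == k)
    omitted = sum(1 for a in answers if a is None)
    wrong = len(key) - right - omitted
    return right, wrong, omitted


def number_right(key, answers) -> int:
    """Score by the number of items judged correctly."""
    return tally(key, answers)[0]


def right_minus_wrong(key, answers) -> int:
    """Score by the right minus wrong formula; omitted items count for nothing."""
    right, wrong, _ = tally(key, answers)
    return right - wrong


# Hypothetical six-item paper: three right, two wrong, one omitted.
key = [True, False, True, True, False, False]
answers = [True, False, False, None, False, True]
print(tally(key, answers))              # (3, 2, 1)
print(number_right(key, answers))       # 3
print(right_minus_wrong(key, answers))  # 1
```

Under the right minus wrong rule each wrong answer cancels one right answer, which is the assumption about guessing that Barton questions.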

The following directions were
subsequently used in giving twenty-five
true-false tests to students:

"If  you  think  a  statement  is 
true,  write  a  plus  sign  in  the  blank  printed 
before  it. 

If  you  think  a  statement  is 
false,  write  a  minus  sign  in  the  proper  blank, 
and  then  draw  a  line  through  the  word  or  the 
words  that  make  the  statement  false. 

Omitted  items  will  not  be  given 

any  credit. 

A  blank  containing  both  a  plus 
and  a  minus  sign  will  be  scored  as  wrong. 

To get credit for judging a
false item, you must draw a line through the
word or the words that make it false.

Your score will be the number
of items you judge correctly."²⁵

24. Barton, Jr., W. A. Improving the True-False

The above directions provide a
cross-out method for indicating the reason for
judging an item to be false.

For  each  of  these  tests  the 
reliability  was  computed,  both  when  the  right 
minus  wrong  formula  was  used  and  also  when 
credit  was  allowed  for  each  correctly  judged 
item.    The  data  showed  an  actual  difference 
of  .11  in  favor  of  the  coefficient  of  correla- 
tion when  the  student  was  allowed  credit  for 
every  item  judged  correctly.     Examination  of 
the  table  showed  that   in  only  four  out  of  the 
twenty-five  cases  was  the  correlation  greater 
when  the  right  minus  wrong  method  of  scoring 
was  used. 

It  was  concluded  by  Barton  that 
the  cross-out  method  has  the  following  advantages: 

1.  It  probably  reduces  guessing 
to  a  minimum  in  judging  true-false  items; 

2. It has specific diagnostic
value in determining the student's comprehension
of the test items;

3. It makes true-false tests
a task of reasoning as well as of recall; moreover,
time consumed in taking tests by this method
is uniformly greater because of the seriousness
with which the students judge the items;

4.  Reliability  of  short  tests 
is  greatly  increased  by  this  method; 

5.  When  this  method  is  used  it 
is  unnecessary  for  the  test  constructor  to  make 
the  true-false  items  equal  in  number.     For  this 
reason  artificiality  of  wording  can  be  greatly 
reduced  and  the  student  will  have  no  reason  to 
check  over  his  test  to  see  whether  he  has  judged 
as  many  items  false  as  true. 

It  would  appear  likely  that  this 
training  in  discovering  the  real  basis  of  falsity 
in  statements  should  develop  many  critical  readers 
unusually  adept  in  perceiving  the  real  meaning 
of  whatever  they  read. 

Paterson and Langlie²⁶ gave a
100-item true-false test on the psychology of
advertising to one hundred eleven students. They
found a reliability of .63 when the papers were
marked right minus wrong.

They state that "Hence, the
assumption that the right minus wrong method
is more reliable than the number right method
of scoring true-false tests is seriously questioned
by these facts."²⁷

26. Paterson, D. G. and Langlie, T. A. Empirical Data
    Journal Applied Psychology 9: 1925

Wood²⁸ studied true-false tests
in several college subjects. The directions
given in all cases were "do not guess." He
says on page 8, "In no case does the number
right score suffer by comparison with the right
minus wrong score, and in only one case does
the right minus wrong compare at all favorably
with the number right as to reliability."²⁹

The discovery of a method of
marking true-false tests by which it was believed
one could consistently get right more than half
of the questions one did not know led to a
study by Dunlop and others.³⁰

The method consisted essentially
in marking the known items, counting the number
of items marked true and the number marked false
and marking all the rest in the same manner as
the lesser number counted, on the assumption
that a properly constructed true-false test
contains an equal number of true and false
statements.
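
The marking method just described can be sketched as below. The function name and the eight-item example are hypothetical. Ties are resolved toward True here, following the Trial C directions quoted later in this section. The sketch also assumes that the items the student marks as known are in fact correct; only then does the remainder contain at least as many items of the less-marked kind.

```python
from typing import List, Optional


def fill_in(marked: List[Optional[bool]]) -> List[bool]:
    """Complete a paper by marking every blank (None) item with whichever
    answer, True or False, has been used less often so far."""
    n_true = sum(1 for m in marked if m is True)
    n_false = sum(1 for m in marked if m is False)
    # Fewer (or equally many) True marks: fill with True; otherwise with False.
    fill = n_true <= n_false
    return [fill if m is None else m for m in marked]


# Hypothetical test with four true and four false statements.
key = [True, False, True, True, False, False, True, False]
# The student knows items 1, 2, 5 and 6 and leaves the rest blank.
marked = [True, False, None, None, False, False, None, None]

completed = fill_in(marked)
n_right = sum(1 for k, c in zip(key, completed) if k == c)
print(completed)  # [True, False, True, True, False, False, True, True]
print(n_right)    # 7
```

In this example three of the four unknown items are credited, against the two that blind guessing would be expected to yield.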

Two 24-item forms of a yes-no
test designed to measure reading comprehension
were combined into a single 48-item list. These
tests had been carefully standardized. The members
of each pair of items were of equal difficulty.
Each  list  was  arranged  in  order  of  difficulty  and 
consisted  of  an  equal  number  of  true  and  false 
items,  the  average  difficulty  of  the  true  ones 
being  equal  to  the  average  difficulty  of  the 
false  ones.     The  combined  list  was  constructed 
by  taking  items  alternately  from  each  of  these 
lists. 

The test was administered to
79 second-year students of the Territorial Normal
School, Honolulu. The students were separated into
three groups. The test was given to each group
three times in immediate succession, each time
with a different set of directions. Groups were
numbered I, II, and III and trials identified as
A, B, and C.

Group I was given the trials
in the order A, B, C; Group II in the order B, C,
A; Group III in the order C, A, B.
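
The three orders given above are successive rotations of A, B, C, so that each trial falls once in each position. A minimal sketch of that arrangement follows; the function name is an assumption made for illustration.

```python
from typing import List


def rotated_orders(trials: List[str]) -> List[List[str]]:
    """Return one order per group, each order shifted one step further."""
    return [trials[i:] + trials[:i] for i in range(len(trials))]


orders = rotated_orders(["A", "B", "C"])
for group, order in zip(["I", "II", "III"], orders):
    print(group, order)
# I ['A', 'B', 'C']
# II ['B', 'C', 'A']
# III ['C', 'A', 'B']

# Each trial occupies each serial position exactly once across the groups.
for position in range(3):
    assert sorted(order[position] for order in orders) == ["A", "B", "C"]
```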

Although  a  time  limit  was  used, 
this  was  made  so  liberal  that  not  a  single  student 
failed  to  finish  any  of  the  trials. 

The directions which differed
in each of the trials were as follows:

"Trial A: In this trial, answer
each question you absolutely know. Do not guess.
Leave all the rest blank."

"Trial B: In this trial, answer
every question as you come to it. Do not leave
any out. If you do not know the answer to a
question, guess. Answer each question before
starting the next."

"Trial C: In this trial, answer
each question you absolutely know. Then count
your Yes's and No's. If you have fewer Yes's
than No's, mark all the rest of the questions
Yes. If you have fewer No's than Yes's, mark all
the rest of the questions No. Since the test
has an equal number of true and false statements,
you will then make a higher score than you would
be likely to make by guessing. If you have an
equal number of Yes's and No's, mark all the rest
Yes."³¹

Trial  A  was   scored  by  both  the 
number  right  and  the  right  minus  wrong  formula 
methods;  the  other  trials  were  scored  by  the 
former  method  only,  the  odd  and  even  questions 
being  scored  separately. 

Trial A showed a significantly
higher reliability than Trial B. (Do not guess
method with guess method.)

Trial A showed a somewhat higher
reliability than Trial C but the difference was
not statistically significant. (Do not guess
method with fill-in method.)

Trial C was significantly more
reliable than Trial B. (Fill-in method with
guess method.)

This study concluded that the
directions to guess lower the reliability; the
directions not to guess give a spuriously high
reliability. Under any set of directions which
causes all students to mark all questions, the
number right method of marking may be used instead
of the right minus wrong method, with a consequent
gain in speed and accuracy of scoring.

The  authors  of  this  experiment 
believed  they  were  warranted  in  concluding  that 
the  new  directions,  under  classroom  conditions, 
would  probably  result  in  a  higher  reliability 
than  other  directions,  and  have  the  added  advan- 
tage that  they  might  be  scored  by  the  number 
right  method. 

It is extremely interesting to
note that Whidden and Davies of Yale University³²
hold that "Whatever method of scoring a true-false
examination may be used, the method of scoring
by the total of right answers is not satisfactory."
If the number of questions is at all large, they
believe that a man would be apt to get half his
answers correct by sheer guessing. The method of
scoring adopted at the Yale Law School for its
true-false examinations is this:

Preliminary warning is given that
guessing will be penalized, that if an answer has
to be guessed, it had better be omitted entirely.
The examination is scored by adding the sum of
omitted answers to twice the sum of wrong answers.
The lowest score is the best and the highest is
worst. The theory underlying the method is that
guessing will be greatly minimized if not entirely
eliminated and that simple lack of information on
a given question is not to be scored against so
heavily as definitely wrong information on that
question.

32. Whidden, Jr., C. H. and Davies, F. J. Method for
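
The scoring rule described above can be sketched as follows. The function names and the fifty-item paper are hypothetical. The closing check rests only on the arithmetic fact that right, wrong and omitted answers together make up the whole test; it is not a statement made by Whidden and Davies. On that basis the penalty score is the number of items less the right minus wrong score, so the two rules rank a set of papers in the same order.

```python
def penalty_score(wrong: int, omitted: int) -> int:
    """Omitted answers plus twice the wrong answers; the lowest score is best."""
    return omitted + 2 * wrong


def right_minus_wrong(right: int, wrong: int) -> int:
    """The right minus wrong score, for comparison."""
    return right - wrong


# Hypothetical paper on a 50-item examination.
n_items, right, wrong, omitted = 50, 38, 7, 5
assert right + wrong + omitted == n_items

penalty = penalty_score(wrong, omitted)
print(penalty)                          # 19
print(right_minus_wrong(right, wrong))  # 31

# Because right = n_items - wrong - omitted, the two scores are complements.
assert right_minus_wrong(right, wrong) == n_items - penalty
```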

It  is  felt  that  the  minimizing  of 
guessing  through  the  preliminary  warning  seems  to 
be  accomplished.     Out  of  seven  examinations  given 
in  June,  1933,  it  was  found  on  all  but  one  that 
the  group  of  men  with  the  highest  average  law 
grades  for  the  year  had  a  smaller  proportion  of 
their  examination  scores  accounted  for  by  omitted 
answers  than  did  the  group  of  men  with  the  lowest 
average  law  grades  for  the  year. 

It is agreed that the criteria of
a good test in any subject are its validity and its
reliability. No teacher is satisfied with an
achievement test which does not cover all of the
important items which she had taught. No pupil
considers a test good which does not stress the
important parts of the unit or the course which
he has completed. Both of these considerations
are matters of the validity of the test. In the
language of the test expert the validity of a test
refers to the "worthwhileness" of the test.

"Validity is in general the degree
to which a test parallels the curriculum and good
teaching practice."

"A measuring instrument is said
to possess validity when it measures what it claims
to measure."³⁴

There are two principal methods of
validation of tests: (a) curricular and (b) statistical.

A study of the published accounts
of the validation of existing achievement tests
shows that most of them are of the curricular type.
According to Symonds, "In the case of the achievement
test, the independent criteria to be used for
validation are few in number. One must usually fall
back on school marks or teacher estimates of

validation (of the achievement test) must be
accomplished in the original choice of material
of the test.

"The validation of a test can be
no better than the present state of knowledge
about the objectives, aims, minimum essentials,
social utility, etc. of the curricular content."

There seems to be a widespread
assumption on the part of achievement test
constructors and authorities that recall, multiple
response and true-false forms of items are
sufficiently equivalent in validity to justify
indiscriminate use from the standpoint of validity.
This is indicated indirectly by the manner in
which the forms are used in published tests and
explicitly by statements in standard books on test
construction.

For instance Ruch says: "When
validity coefficients are corrected for attenuation,
the resulting values are high, showing that
true-false, multiple choice and recall tests measure
roughly the same abilities," and his recommendation
concerning the selection of item forms
for use in a test includes many other considerations
but not that of effect on validity.

Odell says: "It is very probable
that for particular bodies of subject-matter and
for special purposes certain forms of exercises
yield more valid results than do others," thus
seeming to indicate a contrary point of view, but
he goes on to say: "In general it appears that at
least all the more commonly used forms of the new
examination differ so little in regard to validity
that it need not be considered as a factor in
selecting the type to be used."


Tiegs says: "In general, so far as
measurement techniques permit us to determine,
true-false, multiple choice, and completion tests measure
approximately the same thing," and in another place,
"Evidence available indicates that the three most
used types of new-type tests are approximately equal
in validity."

This assumption, Magill states
in his report made in January, 1934, seems to be
based on two types of evidence: one, the similarity
in size of coefficients of correlation between
alternate test forms made up of test items of the
various types and certain criteria of validity; the
other, high intercorrelation between alternate test
forms. The writer of the article goes on to say
that he has been skeptical of the assumption for
the following reasons: "first, recall, multiple
response and true-false items apparently require
greatly dissimilar types and degrees of recall;
second, the criteria employed in the studies of
comparative validities have been various combinations
of essay type examinations, objective type
examinations, instructors' estimates, pupils' estimates,
and term grades, all academic and questionable
substitutes for the life values which supposedly
form the objectives of present-day education."

Magill's investigation consisted
of three forms of a miscellaneous information test
of fifty items given to two classes, made up chiefly
of teachers-in-service, the first forty-four and
the second fifty-four in number.

The first form of the test contained
the fifty items in one word answer form (completion),
the second contained the same items in five-response
form (multiple choice), and the third the same items
in true-false form. The three forms were given in
the order named, one immediately following the other,
and so supervised that there was no opportunity for
the subjects to learn the answers during the test
period (other than through the incidental practice
effect of the tests themselves). The tests were
given to class one as speed tests, with limits of
seven, five and three minutes respectively, which
permitted only a few of the most rapid to finish.
All members of class two were given sufficient time
to finish each test and each, as he finished, noted
the time of finishing. The median for each of the
tests was as follows:

Recall            8:05 minutes
Five-response     4:30    "
True-false        2:40    "

Intercorrelations between gross
scores were calculated and the responses of each
subject to each item of the three forms were
compared to determine the number of inconsistencies
of response; i.e., responses correct on one form
and incorrect on another.
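
The item-by-item comparison described above can be sketched as follows. The function name and the ten-item records are hypothetical and are not Magill's data. Each list holds True where the subject answered the item correctly on that form.

```python
from typing import Sequence


def inconsistency_rate(form_a: Sequence[bool], form_b: Sequence[bool]) -> float:
    """Proportion of items answered correctly on one form and incorrectly
    on the other, for a single subject."""
    if len(form_a) != len(form_b) or not form_a:
        raise ValueError("forms must be non-empty and of equal length")
    differing = sum(1 for a, b in zip(form_a, form_b) if a != b)
    return differing / len(form_a)


# Hypothetical subject: the same ten items in recall and in true-false form.
recall = [True, True, False, False, True, False, True, True, False, True]
true_false = [True, True, True, False, True, True, True, False, False, True]

print(sum(recall), sum(true_false))            # 6 7
print(inconsistency_rate(recall, true_false))  # 0.3
```

The gross scores in this example differ by a single point while three items in ten are answered inconsistently, which is the kind of discrepancy that a comparison of totals alone would not reveal.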

To secure evidence regarding the
influence of corrections for chance upon
intercorrelations, the intercorrelations obtained with
uncorrected scores were compared with those obtained
with the scores of the multiple response and
true-false lists corrected by the formula R-W. If the
inconsistencies were appreciably due to guessing,
the intercorrelations should be proportionately
raised by the corrections for chance.

                                                  Class 1     Class 2

Recall with True-false (uncorrected)              .61±.06     .76±.04
Recall with True-false (R-W)                      .52±.07     .84±.02

Recall with Five-response (uncorrected)           .88±.02     .91±.01
Recall with Five-response (R-W)                   .85±.01     .90±.01

Five-response with True-false (uncorrected)       .60±.06     .91±.01
Five-response (R-W) with True-false (R-W)         .72±.04     .85±.02

It is noted that four of the six
coefficients are reduced in size and two are increased
and that each of the increases is paralleled by a
corresponding reduction in the other class. There
is no evidence, therefore, that the effect of the
inconsistencies can be reduced by corrections for
guessing.

Magill draws the following
conclusions from his experiment:

1. High intercorrelations may be
accompanied by high percentages of inconsistency
in the response to specific items;

2. The percentage of inconsistency
is widely variable in size and also varies inversely
with the gross scores, so that it cannot be considered
to be due to the influence of constant factors, which
might be eliminated by statistical treatment of the
score; and

3. Influence of the inconsistency
in response upon the gross scores is not consistently
reduced by correcting the scores for chance.

He concludes that test constructors
are on safer ground when they strive to so select
and use test item forms that they represent direct
measures of the items of mental attainment under
measurement than when they use the forms
indiscriminately under assumptions of equivalence in validity.


There is no doubt that the
measurement of the validity of each type of objective test
presents one of the most important problems for
research. The difficulty in determining validity
lies in the selection of an adequate criterion of
success in the subject. Various studies have
used different criteria and their findings must
be considered in relation to the adequacy of these
criteria.

The researches in this field have
been summarized by Lee and Symonds who have drawn
the following conclusions as regards the validity
of the various types of objective tests:⁴²

1. Objective tests with the
exception of the true-false tests seem to be slightly
more valid than the essay examination;

2. Completion test is superior
as far as validity is concerned to other types
of objective tests;

3. True-false tests appear to
be the least valid objective type, but modified
forms of it increase its validity;

4. Objective tests correlate higher
than do essay examinations.

The reliability of a test is second
only to validity as a criterion of the worth of a
test. Symonds believes that "It is perhaps as hard
to construct a test with the desired reliability
as it is to construct one with high validity."⁴³
Reliability may be defined as "the degree to which
scores made upon a test at one time agree with
scores made by the same pupils upon the same test
at another time. The expression 'same test' should
be interpreted to include not merely an identical
test, but also a similar and duplicate test."⁴⁴

Since in most cases but one form
of test is available, the practical method of
determining reliability is to divide any test into
two equivalent halves. This may be done by
considering all of the odd numbered items as one test and
all of the even numbered items as a second test.
The coefficient of correlation between the scores
of many students on the odd numbered items and the
scores of the same students on the even numbered
items is then determined.

Toops⁴⁵ has studied the reliability
of new-type tests using two different methods with
results as follows:

                                       Recall   Recognition   True-False

1. Reliability of halves, 124 cases
   (2 forms of 25 statements each):     .448       .385          .340

2. Reliability of two 50-question
   sets (Brown's formula):              .518       .556          .507

In order of decreasing reliability,
the tests stand in the order of recall, recognition,
and true-false.
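
The odd-even procedure and the step-up by Brown's formula can be sketched as below. The function names and the three-pupil example are assumptions made for illustration. The formula used is the usual Spearman-Brown form for a test lengthened by a given factor; applied with a factor of two to the half-test figures .385 and .340 it gives .556 and .507, the values shown for recognition and true-false.

```python
from math import sqrt
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment coefficient of correlation between two score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least two cases")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores on one half do not vary")
    return sxy / sqrt(sxx * syy)


def split_half_r(papers: List[List[int]]) -> float:
    """Correlate odd-item scores with even-item scores.
    Each paper is a list of 1 (right) or 0 (wrong), item 1 first."""
    odds = [sum(p[0::2]) for p in papers]   # items 1, 3, 5, ...
    evens = [sum(p[1::2]) for p in papers]  # items 2, 4, 6, ...
    return pearson_r(odds, evens)


def spearman_brown(r: float, factor: float = 2.0) -> float:
    """Estimated reliability of a test 'factor' times as long."""
    return factor * r / (1 + (factor - 1) * r)


# Hypothetical papers of three pupils on a four-item test.
papers = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0]]
print(round(split_half_r(papers), 3))     # 1.0

print(round(spearman_brown(0.385), 3))    # 0.556
print(round(spearman_brown(0.340), 3))    # 0.507
```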

Ruch and Stoddard⁴⁶ experimented
with a 100-item information test covering the
general field of history and the social sciences,
suitable in difficulty for twelfth-grade pupils.
These items were next divided by chance into two
approximately equal "forms" designated as Form A
and Form B. The items were then adapted to each
of the following five types with the subsequent
results:

                                      Reliability of
                                      100 items by
                                      Spearman-Brown
Type            Form A versus B       Formula

Recall             .83±.010              .90
5 Response         .80±.021              .89
3 Response         .60±.037              .75
2 Response         .74±.027              .85
True-false         .56±.040              .71

45. Toops, H. A. Trade Tests in Education

In order to keep practice effects
at as nearly a minimum as possible, it seemed
inadvisable on the part of the experimenters to
have each pupil take the two forms in all five ways.
For this reason all pupils were given the recall
type Form A followed directly by Form B, and then
one day later were given the same items in one
other type-form. The experiment involved more than
500 pupils; sub-groups used for statistical purposes,
totalling 135, were random samplings of the larger
group.

There  is  close  agreement  between 
this  study  and  that  of  Toops. 

In the summary of researches made
in this field and mentioned before in the matter
of validity, the following conclusions have been
drawn by Lee and Symonds:

1. Objective tests have higher
reliability than essay examinations;

2. Modified true-false tests have
a higher reliability than does the usual true-false
test.

STATEMENT  OF  THE  PROBLEM  TREATED 

Students  are  quite  positive  that 
the  mimeographed  true-false  test  is  "fairer"  than 
the  oral  test.     By  this  is  meant,  they  feel  they 
can  produce  a  higher  score  by  the  reading  method 
than  by  the  listening  method. 

It is felt that more specific
training should be given listening ability in
our schools. However, listening ability may never
receive the proper emphasis in the schools until
we place more of a premium upon it in our
examinations.

Necessity of placing stress on
this ability is revealed by Rankin.⁴⁷ His data
show frequency of use with respect to several
types of communicative ability. The study reveals
that 42% of the "waking time" is spent in listening,
32% in talking, 15% in reading, and 11% in writing.
Listening occupies almost three times as much
activity as reading.

Lehman⁴⁸ presents objective data
that appear contradictory to popular opinion. His
study signified that the results obtained by the
listening method correlate with the reading method
as much as the results from the latter method
correlate with themselves; i.e., the listening method
produces virtually as consistent results as the
reading method. Then again students have the
conviction that they make many more errors by the
listening method than by the reading method.

Lehman's data, on this point, collide
with popular prejudice. By a study of 27,969 answers
on true-false statements, he found that 25.15% of
the answers were in error by the reading method and
24.74% by the listening method. From his study, we might
conclude that the listening method produces
substantially the same results as the reading method.

In his experiment, two modes of
presentation were used, the oral and the reading.
The test consisted of eighty-five true-false
statements given to nine classes in educational
psychology. The statements were first presented
orally within a twenty-five minute period. This
presentation was followed immediately by the
distribution of mimeographed copies of the identical
set of eighty-five true-false statements. Lehman
states that "since the oral presentation preceded
the mimeographed presentation, it seems unlikely
that the order of presentation prejudiced the
quiz results in favor of the oral presentation."

48. Lehman, H. G. The Oral Versus the Mimeographed

Average  coefficients  of  correlation 
for  the  nine  classes  were  as  follows: 

Mimeographed odds versus mimeographed evens    .472
Oral odds versus oral evens                    .512
Oral odds versus mimeographed evens            .439
Oral evens versus mimeographed odds            .489

Comparison of the first two
correlations reveals that the coefficient is slightly
higher for the oral than for the mimeographed
presentation. Although this difference is of negligible
magnitude, it reveals nevertheless that for the study
reported the oral presentation was no less reliable
than the mimeographed presentation.

Comparison of the third and fourth
correlations with the first two shows that the
intercorrelations were as large as the self-correlations.

In harmony with Lehman's findings,
Jensen's study reveals that virtually the same
results are obtained by the listening method as by
the reading method. There was a slight advantage
of the reading method over the reading-listening
method. In Jensen's experiment⁴⁹ three presentations
were given to nine classes all within the
same period: visual, oral, and visual-oral (the
instructor read statements to the class simultaneously
with their reading from and recording their
responses on mimeographed sheets). The nine classes
consisted of three in beginning psychology and six
in freshman college English. The examination
consisted of fifty statements in each instance.
Jensen states that practice effects were controlled
by using three classes in the same subject under
the same instructor and varying the order of
presentation of the examination so that equal amounts
of practice would accrue to each method.

As a further control, equal numbers
of papers were taken (at random) from each of the
classes, that is, twenty-five from each of the
psychology classes and thirty from each of the
English classes, making a total of 255 papers.

Jensen found a slightly higher
correlation for the oral than for the visual (in
harmony with Lehman's findings) and for the visual
than for the visual-oral (a comparison Lehman did not
make).

The coefficients for each of the
three methods of presentation were as follows:

                      Visual      Oral       Visual-Oral

Psychology            .59±.05     .63±.05      .51±.06
English, Group I      .87±.01     .83±.01      .86±.01
English, Group II     .82±.01     .88±.01      .82±.01

The superior accuracy of the English
over the psychology examination may be partially
accounted for by its shorter statements and greater
definiteness; it consisted of sentences to be marked
as to correctness of punctuation, while the psychology
examination was built to cover the concepts treated
in certain chapters of the text used by the students.

In Stump's study,⁵⁰ data were
obtained from five classes in first-year college
subjects, thereby securing a total of 7,363
reactions. The five classes consisted of three classes
of Normal School pupils, two in Elementary Tests and
Measurements, 27 and 22 pupils respectively, and
one in Elementary Educational Psychology, 37
pupils; the two college classes consisted of 23
pupils each in High School Tests and Measurements.

An oral true-false test was given
at the end of the third week of study in each
class. A week following the first test, the
experimenter presented the same statements as before
in mimeographed form. The pupils were asked whether
they could recall completely any of the statements
and a majority stated that not a single statement
could be clearly remembered. It may be concluded
that the first examination would have little, if any,
influence on the results of the second.

The average coefficient of
correlation between the reading scores and the oral
scores was .47; accordingly, Stump states that this
degree of correlation would indicate that the
extra time spent in mimeographing examinations
was not justified.

50. Stump,    F. Oral Versus the Printed Method in
    the Presentation of the True-False
    Examination Journal Educational
    Research Vol. 13:423-4 D'28

Crawford ^     states  that  in  work 
done  to  determine  whether  true-false  tests  measure 
student  knowledge  as  well  when  presented  orally 
as  when  presented  in  mimeographed  form,  the  results 
have  generally  shown  that  the  oral  method  is  as 
good  as  the  mimeographed  method.     Student  reaction 
to  the  oral  method,  however,  is  often  unfavorable, 
with the result that class morale or teacher-pupil
harmony  sometimes  suffers  if  the  oral  method  is 
used  extensively. 

Some  students  are  firmly  and  un- 
alterably opposed  to  the  oral  method  and  persist 
in  classifying  themselves  as  martyrs  when  they 
are  so  tested. 

One  hundred  twenty  University 
students  were  given  two  tests,  one  oral  and  one 
mimeographed.     Each  test  consisted  of  fifty  state- 
ments and  they  were  alternate  forms  of  a  test 
prepared  by  the  author  of  the  textbook  used.  No 

51. Crawford, C. C. Preference Versus Performance in
Taking Oral True-False Tests.
School Review 40:138-41, F '32


norms were published for the two forms, but it
was  thought  that  the  tests  were  more  nearly  stan- 
dardized for  equal  difficulty  than  are  the  usual, 
informal,  teacher-made  tests. 

Scores were tabulated according to
whether each student did better by the oral, by
the mimeographed, or by neither method. Thus a
student  might  prefer  the  oral  method  but  actually 
do  better  on  the  mimeographed  form,  or  he  might 
have  no  preference  and  actually  do  better  on  the 
oral  part  and  so  on. 

The  distribution  of  the  prefer- 
ences and  of  the  performance  was: 

42 students preferred the mimeographed method
43 students preferred the oral method
35 students preferred neither method

This  vote  was  taken  after  the 
class  had  had  considerable  experience  with  both 
methods.

The coefficient of correlation
between preference and performance, .08 ± .06
(with a standard error of .09), best
summarizes the extent of the relation. This
coefficient is so low that it may be interpreted
as no correlation at all. In other words, students'
notions as to which methods give them the best
scores are of no value whatever as indications of
the real facts of the case.
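
The agreement between the probable error (.06) and the standard error (.09) quoted above can be checked with the classical large-sample formulas, S.E. = (1 - r²)/√N and P.E. = 0.6745 × S.E. The sketch below assumes those formulas and N = 120; the source does not state how Crawford computed his figures, and the function names are illustrative only.

```python
import math


def standard_error_of_r(r: float, n: int) -> float:
    """Classical large-sample standard error of a correlation coefficient."""
    return (1.0 - r * r) / math.sqrt(n)


def probable_error_of_r(r: float, n: int) -> float:
    """Probable error, taken as 0.6745 times the standard error."""
    return 0.6745 * standard_error_of_r(r, n)


# Values reported for Crawford's study: r = .08 with 120 students.
se = standard_error_of_r(0.08, 120)
pe = probable_error_of_r(0.08, 120)
print(round(se, 2), round(pe, 2))  # 0.09 0.06
```

Under these assumptions the coefficient is little larger than its own probable error, which is consistent with reading it as no correlation at all.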

This investigation reveals no
reason why faith should be placed in students'
judgments of the relative values of the oral method
and the written method of presenting true-false
tests, since preferences and performances show no
correlation that cannot be ascribed to mere chance.

In  the  writer's  experiment,  the 
problem  is  attacked  from  a  different  point  of  view. 
In neither Lehman's nor Jensen's study were abilities
as  measured  by  mental  tests  taken  into  consideration. 
In  this  study,  the  index  of  "fairness"  of  each 
method  is  regarded  in  relation  to  the  learning 
ability  as  measured  by  mental  ability  tests. 

In  terms  of  the  ideal,  assuming 
that  pupils  do  justice  to  themselves,  those  with 
good  ability  should  make  good  scores,  those  with 
average  ability  average  marks,  and  those  with  poor 
ability  poor  marks,  especially  when  objective  exam- 
inations which  minimize  the  personal  element  in 
grading  are  administered. 


DESCRIPTION  OF  THE  EXPERIMENT 


The  purpose  of  the  writer's 
experiment  is  to  present  data  upon  this  problem, 
listening versus reading of true-false tests,
which  continues  to  attract  considerable  attention. 

In  all  the  researches  studied,  all 
the  experiments  were  found  to  have  been  performed 
in either normal school or college. In no instance
found had an experiment been conducted in high school,
and consequently none was found to have been conducted
in the field of commercial education.

For  the  writer's  experiment,  the 
four  classes  of  twenty-three  pupils  each  in 
Elementary  Bookkeeping  were  included.  Sections 
in  this  particular  High  School  are  formed  the 
preceding  year  by  the  principal,  using  the  Intelli- 
gence Quotients  of  the  pupils  to  form  the  classes 
in  English.     This  classification  for  English 
determines  the  grouping  in  the  other  subjects. 
Two  of  these  sections  of  twenty-three  pupils  each 
were  designated  as  Group  A  and  the  other  two 
sections as Group B. The number of pupils included
in the experiment totalled ninety-two, largely
second-year pupils in High School, with a few
third-year pupils.

Statements  used  in  the  examina- 
tion were  taken  from  two  sources:  Elwell-Fowlkes 
Bookkeeping  Test  I,  Form  A  and  Form  B;  the  Carlson 
Bookkeeping  Tests  1,  2,  3,  and  4. 

The  Elwell-Fowlkes  tests  are 
intended  primarily  for  measuring  general  achieve- 
ment and  are  not  based  on  any  specific  textbook. 
Test  I,  referred  to  above,  is  intended  to  cover 
the  first  semester  of  Bookkeeping. 

Each  of  the  Carlson  Tests  is  an 
objective  test  based  upon  a  complete  analysis  of 
a  definite  section  of  the  textbook,  "20th  Century 
Bookkeeping  and  Accounting":     Test  1  is  based 
upon  material  from  Chapter  I  to  Chapter  IV 
inclusive; Test 2, upon Chapter V to Chapter VII;
Test 3, upon Chapter VIII and Chapter IX; Test 4,
upon Chapter X to Chapter XV.

In  listing  the  statements  taken 
from  the  Elwell-Fowlkes  Test,  those  concerning 
principles  which  had  not  been  taught  to  the  pupils 
up to the time of the experiment (March, 1934)
were omitted.

The  following,  taken  from  Page  2, 
Form  B,  is  shown  by  way  of  illustration: 

1.  The  total  of  all  the  debits  as 
recorded  in  the  ledger  accounts 
should  equal  the  total  of  all 
the  credits. 

2.  The  cost  of  merchandise  sold  is 
always  the  difference  between  the 
total  merchandise  sales  and  the 
total  merchandise  purchases. 

3.  The  debits  in  a  book  of  original 
entry  are  posted  as  credits  in 
the  ledger. 

4.  A  net  profit  increases  the  pro- 
prietary interest  (proprietorship). 

5. A credit balance in the Proprietor's
Drawing (Personal) account at the
close of the first period in business
indicates that the net profit is
in excess of the withdrawals.

6. When an interest-bearing note is
given in payment of an account,
Notes Payable and Interest Expense
(Interest Cost, Interest Paid)
are credited.

7.  A  debit  balance  in  the  Notes 
Receivable  account  indicates  that 
all  notes  received  have  not  been 
paid. 

8. A separate posting to the Cash
account  is  made  for  each  item  in 
the  cash  journal  (cash  book). 


9.     The  entry  to  record  the  receipt 
of  a  note  from  a  customer  is  made 
in  the  general  journal  (journal). 

10. The closing entries for a business
are  usually  made  at  the  end  of 
each  fiscal  period. 

Numbers  1,  2,  3,  7,  8,  9,  and  10 
were  included.    Number  4  was  omitted  because  it 
was  similar  to  a  statement  already  taken  from 
Form A (Form A and Form B as stated by the publishers
were supposed to be alike in organization
and almost equal in difficulty, differing only
in specific content). In regard to the omission
of statement No. 5, while the particular account
mentioned had been taught, the method of handling
it differed and the pupils would have been unable
to answer it. The material included in statement
No. 6 had not been taught at the time and would
not be taught, according to the outline of the
course, until May.

After the material in the Elwell-
Fowlkes test was exhausted, the balance of the
statements used was taken from the Carlson Tests,
the reason for this procedure being that the
Elwell-Fowlkes test covered general information,
while the Carlson tests covered specific information
contained in definite chapters of the
textbook used in class, much of which at the
time of the examination was "old material" to
the pupils.

For  example,  in  Test  I,  covering 
Chapters  I  to  IV,  statements  which  referred  to 
subject-matter  taught  in  November  and  which  had 
not  undergone  any  change  or  enlargement  were 
not included because they would have proved too
simple and would not have served as good test
items.     However,  statements  referring  to  material 
also  taught  in  November,  but  which  had  been 
enlarged  upon  since  then,  thus  requiring  some 
thought  in  answering,  were  included. 

The  following  taken  from  Carlson 
Test  I  will  illustrate  the  above  paragraph: 

11.  All  increases  in  Assets  are 
recorded  in  some  asset  account 
as  credits. 

12.  All  increases  in  Proprietorship 
are  recorded  in  the  account  with 
the proprietor as debits.

13. All increases in Income are recorded
in some income account as debits.

14. All increases in Expense are
recorded  in  some  expense  account 
as  credits. 

15.  All  cash  receipts  are  recorded 
in  the  cash  account  as  debits. 

16.  All  cash  payments  are  recorded 
in  the  cash  account  as  credits. 

17.  All  of  the  proprietor's  invest- 
ments in  the  business  are  recorded 
in  his  account  as  debits. 

18.  All  sales  are  recorded  in  the 
sales  account  as  credits. 

19.  All  purchases  are  recorded  in 
the purchases account as debits.

20.  All  expenses  are  recorded  in 
some  expense  account  as  credits. 


Numbers 11, 12, 16, 19, and 20
were included and the remaining numbers omitted
for reasons given above.

The total number of statements
included in the experiment was 100. Symonds
says, "In general true-false tests are not very
reliable unless one hundred or more statements
are included." Ruch, in discussing the objective
examination in regard to length, says, "Long tests
may be expected to be more valid than short tests

52. Ruch, G. M. The Objective or New-Type Examina-
and if a test is made long enough, it will usually
yield a reasonably valid measure even if many
individual items are faulty or worthless."⁵²

In  regard  to  the  validity  and 
reliability  of  the  Elwell-Fowlkes  Tests,  the 
following  appears  in  their  Manual  of  Directions: 

"Because  of  the  large  number  of 
questions  and  the  variety  of  informational  items 
involved,  the  test  is  much  more  reliable  and  valid 
than  the  customary  final  examination  in  bookkeeping. 
Also  the  tests  cover  the  material  and  activities 
offered  throughout  the  country  during  the  first 
year  of  bookkeeping.     The  reliability  correlation 
between Form A and Form B is, for Test 1, .321 ± .013."

In regard to the validity and
reliability of the Carlson Tests, Mr. Carlson in
an article on "What Is a Good Test in Business
Education?" in the Balance Sheet, May, 1932, quotes
Symonds' and Ruch and Stoddard's theories of validation
(which have already been included in this thesis
in connection with the subject of validity) and goes
on to say that "It is evident from the foregoing
discussion that we may construct tests which are
valid for a single textbook or we may construct
tests which are valid for a course of study."
Each of the Carlson tests is an objective test
based upon a complete analysis of a definite
section of the sixteenth edition of 20th Century
Bookkeeping and Accounting, the textbook used by
the aforementioned ninety-two pupils.

The  coefficients  of  reliability 
are  the  coefficients  of  correlation  between  the 
odd and even items, corrected according to the
Spearman-Brown Formula, and are as follows:
Test  l--.cC>5;  Test  2— .937:  Test  3— .939;  Test  4--. 918; 
Test  5 — .896.     In  computing  these  coefficients, 
a  limited  number  of  papers  were  used.     The  groups 
of  papers  were  selected  at  random,  but  in  each  case 
all  papers  in  one  group  or  class  were  included. 
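
The odd-even procedure described above can be sketched as follows: each pupil's marks on the odd and on the even items are totalled, the two half-scores are correlated, and the result is stepped up by the Spearman-Brown formula, r_full = 2r / (1 + r). The marks below are hypothetical and serve only to illustrate the arithmetic; they are not data from the Carlson tests.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half, factor=2.0):
    """Estimate the reliability of a test 'factor' times as long as the half."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def split_half_reliability(marks):
    """marks: one list of 0/1 item marks per pupil, in item order."""
    odd_scores = [sum(row[0::2]) for row in marks]   # items 1, 3, 5, ...
    even_scores = [sum(row[1::2]) for row in marks]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_scores, even_scores))


# Hypothetical marks for four pupils on a four-item test (1 = right, 0 = wrong).
hypothetical_marks = [
    [1, 1, 1, 1],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
]
reliability = split_half_reliability(hypothetical_marks)
```

The same correction explains why a longer test tends to be more reliable: raising `factor` above 2 raises the estimate for any positive half-test correlation.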

Both the Elwell-Fowlkes and
Carlson tests consist of completion, multiple-choice,
matching, and true-false items. It has been stated
in the discussion on reliability that in order of
decreasing reliability, the various types of objective
tests stand in the order of recall, recognition,
and true-false. This will account for the lower
correlation found in this experiment between the
odd and even items of the true-false test as compared
with the high correlations of both the Elwell-
Fowlkes and Carlson Tests.


The test was first presented to
a group of twenty-five students in Advanced Bookkeeping
for the purpose of determining the scale
of difficulty. The one hundred statements were
then rearranged according to frequency of errors.

They were divided into
two groups of fifty each, the odds, 1-99, and the
evens, 2-100. The odds, 1-99, were presented
orally to Group A and on mimeographed sheets to
Group B; the evens, 2-100, were presented orally
to Group B and on mimeographed sheets to Group A.

The oral tests were presented to
Group A and Group B respectively on one day, and
the reading tests to Groups A and B respectively
the following day.

The method used by the examiner
(the writer) in administering the tests was as
follows:

In the case of the oral tests,
the students were asked

(1) to write their names in the
upper right hand corner of their ruled sheets of
paper;

(2) to write the numbers from
1-25 on one sheet and from 26-50 on the second
sheet.

At this point, students were
informed that

(1) each statement would be read
twice (the first reading was to assist in orientation,
while during the repetition, the pupils
could concentrate upon the decision of "True" or
"False");

(2) true statements were to be
indicated by a plus (+) sign placed at the right
of the corresponding number on the sheets of ruled
paper;

(3) false statements were to be
similarly indicated by the use of a minus (-) sign; and

(4) no questions would be permitted
regarding the reading by the examiner of
the true-false statements.

In the experiment, the examiner
now read the first true-false statement from her
examination paper, pronouncing each word as distinctly
as possible. At the conclusion of the
first reading, the examiner counted silently and
as rapidly as possible from 1 to 10. With no
further delay than that involved in counting from
1 to 10, the examiner then reread the first statement
and again counted from 1 to 10 before proceeding
to the second statement.

In  the  reading  method,   each  pupil 
was  furnished  the  statements  so  arranged  on  mim- 
eographed sheets  that  a  line  could  be  drawn  under 
the  word  "True"  or  "False"  which  appeared  to  the 
right  of  each  statement.     The  papers  for  both 
presentations  were  scored  by  the  "number  right" 
formula. 
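
The counterbalanced arrangement described above (odd statements given orally to Group A and in mimeographed form to Group B, even statements the reverse) and the "number right" scoring can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the "+"/"-" answer coding are chosen for the example, and the statements are represented only by their numbers.

```python
def split_odds_evens(statements):
    """Split a difficulty-ordered list into odd-numbered and even-numbered halves."""
    odds = statements[0::2]   # statements 1, 3, ..., 99
    evens = statements[1::2]  # statements 2, 4, ..., 100
    return odds, evens


def presentation_plan(statements):
    """Assign each half to each group so that every pupil meets both methods."""
    odds, evens = split_odds_evens(statements)
    return {
        "A": {"oral": odds, "reading": evens},
        "B": {"oral": evens, "reading": odds},
    }


def number_right(responses, key):
    """'Number right' score: one point per response agreeing with the key."""
    return sum(1 for given, correct in zip(responses, key) if given == correct)


# Statements are stood in for by their numbers, 1 through 100.
plan = presentation_plan(list(range(1, 101)))
score = number_right(["+", "-", "+"], ["+", "+", "+"])
```

Because each half is given orally to one group and in printed form to the other, any difference in difficulty between the halves bears on both methods alike.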

The  intelligence  quotients  for 
the  ninety-two  pupils  were  secured  by  administering 
the  Otis  Self-Administering  Test  of  Mental  Ability, 
Form A.

All the coefficients of correlation
are handled statistically using the Pearson Product
Moment Method of correlation and the Yule method
of Partial Correlation.

FINDINGS FROM THE DATA

The  relationship  between  scores 
made  by  the  listening  procedure  when  correlated 
with  the  scores  made  by  the  reading  method  within 
the  respective  groups  is  shown  in  Table  I.  The 
correlations  of  .64  and  .61  are  practically  iden- 
tical.    This  shows  that  the  two  groups  as  individ- 
ual groups  did  equally  well  when  both  methods  of 
presentation  were  correlated. 

TABLE I

CORRELATION BETWEEN ORAL AND
READING SCORES

Group   No. of Pupils   Odds 1-99   Evens 2-100   r

A       46              Oral        Reading       .64 ± .06
B       46              Reading     Oral          .61 ± .06

By combining Groups A and B, the
writer found that the correlation between all the
orals and all the readings is .62 as shown in
Diagram IX. This shows that in the combination
of the two groups, giving a total of 92 cases,
the correlation between all the orals and all
the readings was practically as large as the
correlation between the odds and evens in the
separate groups.

When potential ability is considered,
it is interesting to note which method
of  procedure  indicates  a  "fairer"  ranking  of 
the  pupils. 

Under  ideal  conditions,  we  would 
expect  those  with  high  potential  ability  to  make 
high  scores,  those  with  low  potential  ability  to 
make  low  scores,  and  those  with  average  ability 
average scores. We never get this ideal condition,
but we can at least state which of these two methods
produces the "fairer" ranking of students when
potential  ability  is  measured  by  means  of  the 
intelligence  quotient. 

The coefficients of correlation
showing the relationship between scores made by
the oral procedure and mental ability, and also the
relationship between scores in the reading method
and mental ability, for each group are given in
Table II.

TABLE II

CORRELATION BETWEEN SCORES AND
INTELLIGENCE QUOTIENT

Group   No. of Pupils   Oral Scores and I.Q.   Reading Scores and I.Q.

A       46              .49 ± .07 (1-99)       .38 ± .03 (2-100)
B       46              .34 ± .09 (2-100)      .45 ± .08 (1-99)


In Group A, the correlation
between the oral scores and intelligence quotient
for the odds (.49) was higher than that between
the scores on the evens given by the reading method
and the intelligence quotient (.38). In comparing
the coefficients of correlation for the B group,
the correlation between the reading scores and
the intelligence quotient (.45) is higher than
the correlation between the oral scores and the
intelligence quotient (.34).

In  other  words,  the  mental  ability 
of  Group  A  compared  more  favorably  with  the  oral 
presentation  than  with  the  reading  presentation, 
while  in  the  case  of  Group  B,  conditions  were  the 
reverse. 


It is interesting to note that
the correlation of the odd items (1-99) with the
mental ability of each group is practically identical
for the two presentations; the same is true
of the even items, but with a lesser degree of
correlation.

The correlation for the combined
groups as shown in Table III varies by only one
point.

TABLE III

CORRELATION BETWEEN SCORES AND INTELLIGENCE
QUOTIENT FOR THE COMBINED GROUPS

Groups         Variables               r

A and B (92)   I.Q. and all orals      .37 ± .06
A and B (92)   I.Q. and all readings   .38 ± .06

The  similarity  between  the  above 
correlations  signifies  that  the  oral  method  of 
presentation  tells  as  true  and  as  "fair"  a  story 
of  the  achievement  of  the  pupils  as  the  reading 
method  does. 

The partial correlation between
all orals and all readings, holding intelligence
quotient constant, was computed from the correlation
of .62 ± .04 between all orals and all readings and
the correlations shown in Table III. It was found
to be .56 as shown in Table IV.

TABLE  IV 

PARTIAL  CORRELATION  HOLDING 
INTELLIGENCE  QUOTIENT  CONSTANT 

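r12 (Groups A and B): r = .37 between I.Q.
and all orals

r13 (Groups A and B): r = .38 between I.Q.
and all readings

r23 (Groups A and B): r = .62 between all orals
and all readings

$$ r_{23.1} = \frac{r_{23} - r_{12}\, r_{13}}{\sqrt{1 - r_{12}^{2}}\,\sqrt{1 - r_{13}^{2}}} = .56 $$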
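
The figure of .56 can be verified from the three coefficients listed in Table IV. The sketch below assumes the standard first-order partial correlation formula, with intelligence quotient as the variable held constant; the function and variable names are illustrative only.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return numerator / denominator


r_oral_reading = 0.62  # all orals with all readings
r_iq_oral = 0.37       # I.Q. with all orals
r_iq_reading = 0.38    # I.Q. with all readings

partial = partial_correlation(r_oral_reading, r_iq_oral, r_iq_reading)
print(round(partial, 2))  # 0.56
```

The drop from .62 to .56 is small, which suggests that only a minor part of the agreement between the two methods is attributable to intelligence quotient.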

The writer was interested to
know whether the students made more errors when
the true-false test was presented orally than
when presented on mimeographed sheets. The total
number of errors for each method of presentation
was tabulated and percentages figured as shown in
Table V.

The number of pupils who preferred
the oral method of presentation to the
written was found to be 15%, the remaining 85%
preferring the mimeographed sheets. This is
particularly interesting in view of the results
shown in Table V, which indicate that the pupils
seemed to do as well by the oral as by the
reading procedure.

TABLE V

TOTAL NUMBER OF ERRORS
AS EXPRESSED IN PER CENT

Group    Test   No. of Errors     Group    Test      No. of Errors

A (46)   Oral    559              A (46)   Reading    486
B (46)   Oral    498              B (46)   Reading    437

Total           1057              Total               923

Total number of answers: 92 x 100 = 9200
Errors in % = .11                 Errors in % = .10
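
The figures in Table V can be reproduced by dividing each method's error total by the 9200 answers given in all. The sketch below also shows, as an alternative not taken in the table, the rate obtained when each total is divided by the answers given under that method alone; the figure of 4600 is an assumption drawn from the design, in which each of the 92 pupils answered fifty statements by each method.

```python
# Error counts as reported in Table V.
errors = {
    "oral": {"A": 559, "B": 498},
    "reading": {"A": 486, "B": 437},
}

PUPILS = 92
STATEMENTS_TOTAL = 100       # statements answered by each pupil in all
STATEMENTS_PER_METHOD = 50   # assumption: half of them under each method


def error_share(method, answers):
    """Errors under one method as a fraction of the given number of answers."""
    return sum(errors[method].values()) / answers


# As in Table V: share of all 9200 answers.
oral_of_all = error_share("oral", PUPILS * STATEMENTS_TOTAL)
reading_of_all = error_share("reading", PUPILS * STATEMENTS_TOTAL)

# Alternative: share of the 4600 answers given under that method.
oral_of_own = error_share("oral", PUPILS * STATEMENTS_PER_METHOD)
reading_of_own = error_share("reading", PUPILS * STATEMENTS_PER_METHOD)

print(round(oral_of_all, 2), round(reading_of_all, 2))  # 0.11 0.1
```

On either basis the comparison between the two methods is unchanged, since both totals are divided by the same number of answers.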

Inasmuch as the coefficient of
correlation (.62 ± .04) between the entire test
given orally and the entire test given by the
reading method is as high as the correlation of
the test itself (.64 and .61), it may be safely
concluded that, in so far as the present study
is concerned, the oral and the mimeographed presentations
measured identical abilities and that
they measured these abilities with approximately
equal effectiveness.

This  study,  however,  does  not 
purport  to  show  that  the  oral  presentation  is 
equally  fair  to  every  pupil.     Neither  mode  of 
presentation  will  enable  every  student  to  make 
his  best  possible  showing  since  a  few  pupils  are 
likely  to  be  handicapped  by  inferior  hearing 
ability and a few others are likely to be handicapped
by inferior reading ability.

While the oral true-false test
would avoid the labor and expense involved in
mimeographing true-false examinations, there are
other factors to be considered. Undoubtedly, some
instructors would not read distinctly enough to
be understood when directing the examination and
some would read the statements so as to "point"
the answers.

The oral presentation revealed
that the oral test as given afforded the following
advantages:

(1) When each statement is read
twice, the rapid count from 1 to 10 provides sufficient
time for students to recognize and to
indicate its truth or its falsity;

(2) Since the students know in
advance that no statement will be read more than
twice, they make a sincere effort to understand
the first reading;

(3) The uniform length of the
pauses between statements permits the students to
anticipate accurately the reading of successive
statements and they are then enabled to concentrate
maximum attention in the direction of the examiner
at the appropriate moment;

(4) The above plan also permits
the students brief intervals of more or less complete
relaxation. Both time and energy are thus
conserved.

Further  investigation  is  essential 
to  the  solution  of  this  problem  if  we  are  to  shape 
our  testing  procedures  in  conformity  with  individ- 
ual differences  appearing  in  the  amount  that  a 
student   "knows"  when  the  same  examination  stimuli 
are  presented  by  different  methods.     Until  such 
investigations are made, the oral method of presenting
true-false examinations may be considered
as  effective  as  the  visual,  and  considerably  more 
desirable,  because  of  the  resultant  economy  in 
both  time  and  money. 


TEST 

1.  A  net  loss  in  the  business  increases 

the  proprietorship.  True  False 

2.  The  purpose  of  the  Trial  Balance  is 

to  prove  the  equality  of  debits  and  credits.  True  False 

3. A decrease in the asset cash is debited
to the Cash Account.  True  False

4.  A  debit  in  the  Rent  Expense  account 
indicates  a  payment  for  the  use  of  store  or 

office.  True  False 

5.  The  left  side  of  any  account  is  used 

to  record  credits.  True  False 

6.  The  transfer  of  debits  and  credits 
from  the  books  of  original  entry  to  the  ledger 

is  called  posting.  True  False 

7. The difference between the two sides

of  an  account  is  called  the  balance.  True  False 

8.  A  check  received  is  recorded  in  the 

cash  receipts  side  of  the  Cash  Book.  True  False 

9.  Proprietorship  equals  assets  minus 

liabilities.  True  False 

10.  Depreciation  Reserve  account  is  a 

liability.  True  False

11.  An  investment  in  a  business  made  by 

a  proprietor  is  credited  to  his  capital  account.  True  False 

12.  The  excess  of  debits  in  the  Cash 

account  over  the  credits  shows  cash  on  hand.  True  False 

13.  The  adjusting  and  closing  entries  at 
the  end  of  a  period  are  recorded  in  the 

General  Journal.  True  False 

14.  Errors  in  addition  or  subtraction  of 
an  account  in  the  Ledger  are  revealed  by  the 

Trial  Balance.  True  False 


15. It is customary to rule the account with
a charge customer when it is in balance.  True  False

16. The exchange of one asset for another
of equal value does not affect the proprietorship
account.  True  False

17. The list of merchandise on hand at
any time is referred to as Merchandise
Inventory.  True  False

18. The journal is a book of original
entry.  True  False

19. The number of the page of the Journal
from which a posting is obtained should be
entered in the Ledger.  True  False

20. All asset accounts have credit balances.  True  False

21. All purchases are recorded in the
Purchases Journal.  True  False

22. The Post-Closing Trial Balance is
taken after the Ledger has been closed.  True  False

23. The account with the proprietor in
which his withdrawals of cash and merchandise
are recorded is called the Capital account.  True  False

24. The closing entries for a business
are made at the end of each fiscal period.  True  False

25. The analysis of a business transaction
into its debit and credit elements is called
journalizing.  True  False

26. The Sales Account is classified as
an expense account.  True  False

27. The payment of a note payable by the
business decreases liabilities.  True  False

28. Return sales are recorded as credits
to the Sales account.  True  False

29. The payment of rent is recorded in the
cash receipts side of the Cash Book.  True  False


30. A correcting entry is usually recorded
in the General Journal.  True  False

31. All expenses are recorded in some expense
account as credits.  True  False

32. A fiscal period is always one month in
length.  True  False

33. All purchases of store fixtures, desks,
etc. are debited to the Purchases account.  True  False

34. The net profit of a period is the sum
of the gross profit and the operating expenses.  True  False

35. Increases in liabilities are credited
to the proper liability account.  True  False

36. The Balance Sheet shows the condition
of the business at a definite time.  True  False

37. A debit balance in the Notes Receivable
account indicates that all notes received have
not been paid.  True  False

38. A Trial Balance of the Ledger taken
before the books are closed does not differ
from a Trial Balance taken immediately after
the books are closed.  True  False

39. In every business transaction, there
is an exchange of one value for another.  True  False

40. The amount of supplies used during a
period should be entered in the Profit and
Loss Statement as an expense.  True  False

41. All sales on account are recorded in
the Sales Journal.  True  False

42. When a Note Receivable is paid, it
becomes a Note Payable.  True  False

43. A separate posting to the Cash account
is made for each item in the Cash Book.  True  False

44. An increase in salaries paid decreases
proprietorship.  True  False


45. Debits in the Journal are posted as
credits to the Ledger.  True  False

46. An additional investment by the proprietor
is debited to the Capital account.  True  False

47. The total of the Purchases Journal is
posted to the credit of the Purchases account.  True  False

48. Payment of a household bill of the
proprietor should be debited to the Drawing
account.  True  False

49. All increases in Expenses are recorded
in appropriate accounts as debits.  True  False

50. The information for the Profit and Loss
Statement is obtained from the Balance Sheet
columns in the Work Sheet.  True  False

51. A personal account in which the credit
side of the account is larger than the debit
side is considered a liability.  True  False

52. When the debits equal the credits a
Trial Balance is said to be out of balance.  True  False

53. The person to whom merchandise is sold
is called a creditor.  True  False

54. If all the notes given by the business
are not paid, the Notes Payable account will
show a debit balance.  True  False

55. If the Notes Payable account shows a
credit balance, it indicates that not all the
notes given or issued have been paid.  True  False

56. The balance of the Sales account is
transferred to the credit side of the Profit
and Loss Summary account.  True  False

57. Increases in assets are credited to
the proper asset account.  True  False

58. In ruling the Capital account, the
balance is brought down to the credit side of
the account.  True  False


59. The balance of the Purchases account is
transferred to the debit side of the Profit and
Loss Summary account.  True  False

60. The difference between the supplies
inventory and the supplies account appears in
the Adjustment columns of the Work Sheet.  True  False

61. The debit side of the Accounts Payable
account usually exceeds the credit side.  True  False

62. Only the assets, liabilities, and
capital accounts appear in the Post-Closing
Trial Balance.  True  False

63. The changes in the asset accounts Supplies
and Prepaid Insurance are recorded daily.  True  False

64. The balance of the Depreciation Reserve
account subtracted from the asset account gives
the book value of the asset.  True  False

65. Net profit is posted to the credit side
of the Capital account.  True  False

66. The excess of the operating expenses of
a business over the gross profit is net loss.  True  False

67. All cash sales are recorded in the
Sales Journal.  True  False

68. The cash and capital accounts are
balanced at the close of the fiscal period.  True  False

69. The account with each charge customer
is credited with increases.  True  False

70. Gross Profit minus Operating Expense
equals Net Profit.  True  False

71. When cash is the only investment, it is
recorded in the General Journal.  True  False

72. Written promises to pay received from
others are debited to the Notes Payable account.  True  False

73. Expired Insurance is the difference
between Unexpired Insurance and the Prepaid
Insurance.  True  False


74. Adjusting entries are made at the
beginning of a fiscal period.    True  False

75. The amount of each sale in the Sales
Journal is posted to the debit of the customer's
account in the Ledger.    True  False

76. An error in posting to the wrong account
is not revealed by the Trial Balance.    True  False

77. Expired Insurance appears in the Balance
Sheet Statement.    True  False

78. The cost of supplies used is entered in
the Balance Sheet columns of the Work Sheet.    True  False

79. Supplies become an expense when a part
or all of the supplies are used.    True  False

80. The Depreciation Reserve account usually
has a credit balance.    True  False

81. The amount of insurance expired is
entered in the Profit and Loss columns of the
Work Sheet.    True  False

82. The Prepaid Insurance account is an expense
account.    True  False

83. The cost of goods sold during a period is
always the difference between sales of that
period and purchases.    True  False

84. The statement of assets, liabilities,
and proprietorship is called a Profit and Loss
Statement.    True  False

85. Accounts with creditors are classified
as Accounts Receivable.    True  False

86. At the close of the fiscal period, the
ending Merchandise Inventory is debited to the
Purchases account.    True  False

87. Return purchases are recorded as credits
to the Purchases account.    True  False


88. At the close of a fiscal period all
income and expense accounts are closed into
the Profit and Loss Summary account.    True  False

89. All asset accounts in the ledger are
adjusted at the close of the fiscal period.    True  False

90. The amount of fuel used during a period
would appear in the Balance Sheet as a liability.    True  False

91. There is no particular order as to the
arrangement of accounts in the ledger.    True  False

92. The total of all debits as recorded
in the ledger accounts should equal the total
of all the credits.    True  False

93. The beginning Merchandise Inventory
is debited to the Purchases account at the
close of the fiscal period.    True  False

94. The balance of the Cash account in
the ledger is brought down to the credit side
of the account.    True  False

95. The Depreciation Expense account is closed
into the credit side of the Profit and Loss
Summary account.    True  False

96. In closing the ledger, if the credits
in the Profit and Loss Summary account exceed
the debits, the difference is a loss.    True  False

97. The Depreciation Reserve account appears
in the Profit and Loss Statement.    True  False

98. The use of several books or journals
of original entry reduces the number of postings
to be made.    True  False

99. The Cash Book is proved after posting
to the Ledger.    True  False

100. A note received in payment on account
is entered in the General Journal.    True  False


TABLE VI
GROUP A

Score Table with Intelligence Quotient

Case    I.Q.    Oral Score    Reading Score
                   1-99           2-100

  1     103         88              88
  2      94         80              80
  3     100         78              72
  4     108         78              72
  5     105         96              84
  6     109         76              76
  7      95         78              72
  8     102         74              82
  9     101         72              70
 10     108         80              84
 11     119         82              82
 12     101         88              92
 13     103         86              94
 14     110         92              88
 15      98         82              94
 16     107         76              76
 17     110         86              82
 18     106         82              90
 19     115         82              78
 20      84         60              74
 21     105         70              86
 22     111         74              84
 23      98         90              88
 24     105         76              76
 25     104         68              78
 26     104         72              78
 27      92         62              66
 28     102         70              78
 29     110         72              68
 30     102         84              PO
 31      98         64              70
 32      95         66              74
 33     110         72              70
 34     100         80              72
 35      94         68              76
 36      98         78              70
 37     107         78              82
 38     107         86              86
 39      95         70              72
 40     108         76              90
 41     113         78              86
 42     114         84              92
 43     104         70              74
 44      93         66              68
 45     106         84              86
 46      96         64              78


TABLE VII
GROUP B

Score Table with Intelligence Quotient

Case    I.Q.    Score    Score
                1-99     2-100

  1      96      74       76
  2      99      74       76
  3      98      78       34
  4      96      R6       80
  5      99      80       70
  6     108      74       82
  7      96      64       74
  8      99      78       68
  9     115      76       82
 10     105      86       94
 11      95      76       78
 12      91      72       80
 13      96      68       72
 14      96      80       84
 15      90      74       68
 16     104      76       80
 17     104      74       76
 18      92      82       94
 19     102      88       82
 20     113      86       82
 21      81      64       68
 22     108      84       90
 23     100      82       80
 24     106      84       92
 25      95      78       80
 26     105      82       88
 27      96      34       88
 28      90      86       62
 29     102      73      100
 30     105      80       84
 31     104      94       98
 32      97      86       88
 33     111      86       88
 34      89      82       78
 35     112      88       90
 36     101      84       80
 37      98      68       70
 38     105      68       72
 39      94      32       92
 40     105      68       72
 41      83      76       74
 42     115      90       94
 43     104      72       58
 44     100      80       72
 45     119      86       94
 46      99      78       76


TABLE VIII

TEST SCORE MEANS and STANDARD DEVIATIONS


Group       Test                            Mean     Standard Deviation

A           1-99 oral                       77.63    8.35
"           2-100 reading                   80.35    7.80

B           1-99 oral                       81.      9.01
"           2-100 reading                   79.7     7.20

A plus B    (1-99 oral
            (2-100 oral                     78.67    8.85

A plus B    (1-99 reading
            (2-100 reading                  80.67    8.45


TABLE  IX 

INTELLIGENCE  QUOTIENT  MEANS  and  STANDARD  DEVIATIONS 


Group  Mean  Standard  Deviation 

A  103.81  7.75 

B  101.09  8.05 

A  plus  B  102.44  7.75 
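
The means and standard deviations reported in Tables VIII and IX can be illustrated with a short computation. The sketch below is only an illustration: it assumes the population form of the standard deviation (division by N), which the text does not specify, and it uses the I.Q. values of the first five cases of Table VI rather than the full group, so its results are not expected to reproduce the tabled figures. The function names are chosen here for the example.

```python
import math


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def std_dev(values):
    """Standard deviation, population form (divide by N).

    Assumption: the source does not say whether N or N - 1 was used,
    or whether the figures were computed from grouped data.
    """
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


# I.Q. values of cases 1-5 of Group A (Table VI); a sample for illustration only.
iq_first_five = [103, 94, 100, 108, 105]

iq_mean = mean(iq_first_five)
iq_sd = std_dev(iq_first_five)
print(f"mean = {iq_mean:.2f}, standard deviation = {iq_sd:.2f}")
```

Applying the same two functions to all 46 cases of a group would give the ungrouped mean and standard deviation for that group.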


STATISTICAL  RESEARCH 


CORRELATION SCATTERGRAMS


DIAGRAM I

Correlation of Group A: Intelligence Quotient
with Oral Test, Questions 1-99; r = .49

DIAGRAM II

Correlation of Group A: Intelligence Quotient
with Reading Test, Questions 2-100; r = .38

DIAGRAM III

Correlation of Group B: Intelligence Quotient
with Oral Test, Questions 2-100; r = .34

DIAGRAM IV

Correlation of Group B: Intelligence Quotient
with Reading Test, Questions 1-99; r = .45

DIAGRAM V

Correlation of Group A: Oral Test, Questions 1-99
with Reading Test, Questions 2-100; r = .64

DIAGRAM VI

Correlation of Group B: Oral Test, Questions 2-100
with Reading Test, Questions 1-99; r = .61

DIAGRAM VII

Correlation of Groups A and B: Intelligence Quotient
with Oral Test 1-99 and Oral Test 2-100; r = .37

DIAGRAM VIII

Correlation of Groups A and B: Intelligence Quotient
with Reading Test 1-99 and Reading Test 2-100; r = .38

DIAGRAM IX

Correlation of Groups A and B: Reading Tests
1-99 and 2-100 with Oral Tests 1-99 and 2-100; r = .62
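
The coefficient r quoted with each diagram is a correlation coefficient between two sets of paired measures. As a hedged illustration, the sketch below computes the Pearson product-moment coefficient from raw paired scores. It is an assumption that this ungrouped formula corresponds to the procedure used for the diagrams, since scattergram methods often work from grouped data. The sample values are the two score columns for cases 1-5 of Table VI, so the result is not comparable with the coefficients reported for the full groups. The names `pearson_r`, `scores_1_99`, and `scores_2_100` are chosen here for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


# Score columns for cases 1-5 of Group A (Table VI); illustration only.
scores_1_99 = [88, 80, 78, 78, 96]
scores_2_100 = [88, 80, 72, 72, 84]

r = pearson_r(scores_1_99, scores_2_100)
print(f"r = {r:.2f}")
```

A coefficient near 1 indicates that cases scoring high on one measure tend to score high on the other; a coefficient near 0 indicates little linear relationship.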